0.0 ABSTRACT

Several methods of extending the PDP-11 instruction
set are discussed. Coding comparisons are made.
Subject to the trivial weighting scheme used, two
solutions were excluded from further analysis
because of their poor performance. The
"multiply/divide" subsolution, as discussed in sections 4.4
and 5.4, was the best performer.
1.0 INTRODUCTION

A more elaborate version of the PDP-11/20 is
considered as a possible candidate for the
PDP-K. It is felt that if the PDP-K is a
member of the PDP-11 family, substantial
gains could be obtained from:

1.1 Upwards Program Compatibility

For DEC this would mean a lower total
software investment, and new machines
could be introduced more easily as
present PDP-11 software would run on
the PDP-K.

For customers this would mean that they
could move to a larger machine without
the direct need for reprogramming.

1.2 Peripheral Compatibility

Only one line of peripheral devices has
to be built. The introduction of a
new machine could be done more easily
for this reason. Any new peripheral
device would be available for the whole
family.
2.0 PROBLEMS IN ADAPTING THE PDP-11 ARCHITECTURE TO
A BIGGER MACHINE

Two important problems of the PDP-11 have to be
solved in order to meet the PDP-K requirements.

2.1 Limited number of instructions and limited
amount of opcode space left. For the PDP-K
three more classes of instructions are
considered:

2.1.1 EAE instructions, i.e., rotate/shift
and multiply/divide for 16-bit words.

2.1.2 Double Precision Integer Arithmetic
instructions.

2.1.3 Floating Point Arithmetic Instructions.

2.2 Limited Address Space

The total amount of addressable core memory
on the PDP-11/20 is 64K bytes (1K is 1024),
or 32K 16-bit words. For a big 32-bit
version of the PDP-11 this would mean that
only 16K 32-bit words could be addressed, which
is certainly not adequate for such a
machine.
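
The figures in section 2.2 follow from a 16-bit byte address. A minimal sketch of that arithmetic is given here; the helper name is hypothetical and only the 16-bit address width is taken from the PDP-11/20.

```python
# Address-space arithmetic for section 2.2 (a sketch).
# ADDRESS_BITS is the PDP-11/20 address width; addressable_units is a
# hypothetical helper introduced only for this illustration.
K = 1024
ADDRESS_BITS = 16


def addressable_units(unit_bytes: int) -> int:
    """Number of units of the given size reachable with byte addresses."""
    return (2 ** ADDRESS_BITS) // unit_bytes


bytes_total = addressable_units(1)   # bytes
words_16 = addressable_units(2)      # 16-bit words
words_32 = addressable_units(4)      # 32-bit words
print(bytes_total // K, words_16 // K, words_32 // K)  # 64 32 16
```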
3.0 PURPOSE OF MEMORANDUM

The purpose of this memorandum is to examine the
suggested methods of solving the first problem:
extending the basic PDP-11 instruction set. An
acceptable solution, subject to several constraints,
will be sought.

3.1 Program compatibility at least on the assembly
language level.

3.2 Simplicity in programming by minimizing the
number of instruction formats and restrictions
imposed on instructions.

3.3 Opcode space left for future expansion.

3.4 Opcodes of the largest member of the family
have to fit in the added instruction set, thus
minimizing the number of formats, and making
programming easier.
4.0 POSSIBLE SOLUTIONS

Four possible solutions to the opcode space problem are
shown below. They are followed by a discussion in
section 5.0.



4.1 Implement new instructions as "pure stack"
instructions (i.e., zero address). Each new
instruction can now be specified with one
combination out of 2^^. This allows for hundreds
of new instructions. Any binary operation (like
multiply, divide, etc.) would take the two
operands from the top of the stack, and leave the
result on the top of the stack. Register 6
would be used as the implied stack pointer.

4.2 Introduce a flag to indicate that the remainder
of the word containing the flag (note: remainder
can be = 0) and the next word form a new
instruction. Depending on the length of the
flag, two cases exist.

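4.2.1 Full Word Flag

          Word N              Word N + 2
    +-----------------+---------------------+
    |   16-Bit Flag   |   New Instruction   |
    +-----------------+---------------------+

4.2.2 Partial Word Flag

          Word N              Word N + 2
    +------+----------+---------------------+
    | Flag |        New Instruction         |
    +------+--------------------------------+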



The advantage of this technique is that
the new instructions can have the same
source-destination format as the standard
(i.e., current PDP-11/20) instructions.

The disadvantage is that every new
instruction takes two words. The
partial word flag case offers the
advantage of a greater number of
new instructions at the expense of
somewhat more complicated hardware.
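
The two-word cost of a flagged instruction can be sketched as a length decoder for the full word flag case (4.2.1). The flag value below is purely hypothetical, since the memo does not assign one, and operand address words are left out to keep the sketch small.

```python
# Sketch of instruction-length decoding under the full word flag scheme.
# FLAG_WORD is a hypothetical value; the memo does not assign one.
FLAG_WORD = 0o170000


def instruction_lengths(words):
    """Return the length in words of each instruction in a word stream.

    A flag word plus the following word form one new two-word instruction;
    every other word is treated as a one-word standard instruction.
    """
    lengths = []
    i = 0
    while i < len(words):
        if words[i] == FLAG_WORD:
            if i + 1 >= len(words):
                raise ValueError("flag word without a following instruction word")
            lengths.append(2)
            i += 2
        else:
            lengths.append(1)
            i += 1
    return lengths


stream = [0o010001, FLAG_WORD, 0o000123, 0o060102]
print(instruction_lengths(stream))  # [1, 2, 1]
```

In the partial word flag case only some bits of word N would be compared against the flag, which is where the somewhat more complicated hardware comes in.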



4.3 Modes

A mode is a (hardware) state of the processor
that allows instructions to be interpreted
differently. Basically two kinds of modes
have to be recognized:

4.3.1 Enter and leave modes only with dedicated
commands (i.e., only switch modes when
an instruction specifies to do so).

4.3.2 Enter modes for a specified number of
instructions after which the mode is
switched back to the standard mode
automatically.

The advantage of modes is that instructions
in any mode are only 1 word long. The
disadvantage is that special instructions have
to be given to enter, and in the case of 4.3.1,
to leave the mode.

4.4 Use Reserved Multiply/Divide Space

These two opcode spaces are not used in the
PDP-11/20. The to-be-added two-operand
instructions can be implemented as source-destination
instructions where the stack is
one implied operand, and the second operand
is specified with the full 6-bit destination
field of the instruction. One of these 6
bits can be used as a direction bit such that
operations can have either their source or
destination as the implied stack. This allows
for 32 new instructions to be specified.
5.0 EVALUATION OF PROPOSED SOLUTIONS

When evaluating the proposed solutions, the implementation
of a 32-bit version of the PDP-11 should be included. For
such a machine, double-precision floating point instructions,
together with EAE instructions operating on 32-bit
registers, are desirable (assuming that these instructions
can operate on registers). This means that opcode space
for those instructions has to be reserved to provide for
their efficient operation.

Simplicity in programming and machine organization dictate
that the number of instruction formats for the three
classes of new instructions (as discussed in section
2.1) should be minimal. In order to make the extended
instruction set more acceptable, it is very desirable to
make the added instructions fit in currently existing
formats, or add at most a single new format. Several
coding comparisons are done to assist in the evaluation.
The five problems below (P1-P5) are considered
representative. The assumptions made in coding the
problems can be deduced from the listed code in Appendixes
A-D. The variables A, B, C, D and E are considered
single precision floating point (32-bit) numbers.

P1: A <- B*C                          /simple case

P2: A <- (B+C)*(D+E)                  /temporary variable case

P3: A(i) <- B(i)*C(i)                 /subscripted case

P4: A(i) <- B(i+3)*C(i*5)             /mixed arithmetic case

P5: A(i,j) <- A(i,j)+B(i,k)*C(k,j)    /multi-dimensional
                                      array case

P5 is an example of the inner-loop statement of the
array multiplication A = B * C. It is assumed
that the array bounds are declared from 0 to u. For
array B this would be: Real Array B (0 - bu1, 0 - bu2).
The first index of B goes to bu1, the second to bu2.
It will be assumed that the indexes are in registers
Ri, Rj and Rk.

Assuming that the indexes i and j are in registers Ri
and Rj, the value B(i,j) will be addressed as follows:
Location of B(i,j) = location of B (i.e., starting
location of matrix) + (i*bu1+j).
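
The subscript computation for P5 can be sketched as follows. The sketch follows the formula exactly as the memo states it; the function names and the sample values are illustrative only, and no scaling for the 32-bit element size is modeled.

```python
# Sketch of the address computation used for the array case P5.
# element_offset and element_location are hypothetical names.
def element_offset(i: int, j: int, bu1: int) -> int:
    """Offset of B(i,j) from the start of B, as stated in the memo: i*bu1 + j."""
    return i * bu1 + j


def element_location(base: int, i: int, j: int, bu1: int) -> int:
    """Location of B(i,j) = location of B + (i*bu1 + j)."""
    return base + element_offset(i, j, bu1)


print(element_location(base=1000, i=2, j=3, bu1=10))  # 1023
```

The multiply by bu1 is the 16-bit integer multiply that appears as IMUL or MUL in every P5 listing of the appendixes.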
5.1 Pure Stack Operations

In order to make the pure stack operations efficient,
one of the opcode spaces reserved for multiply/divide
has to be used for a double move (MOVD) instruction.

MOVD: Move 2 words (32 bits) from S(ource) to D(estination).
This instruction is required especially in a 32-bit
machine. The one binary opcode space left can be
used to implement the EAE instructions.¹ The
instruction format would be as follows:



    +-------------+------------+---------------+
    |  OPERATION  |  REGISTER  |  DESTINATION  |
    +-------------+------------+---------------+

This same format is used for the JSR (subroutine call)
instruction. The EAE instructions are made to operate
on registers only. The register involved is
specified by the 3 "register" bits.
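
As an illustration of this format, the sketch below splits a 16-bit word into the three fields, using the PDP-11 JSR layout (7-bit operation, 3-bit register, 6-bit destination). The function name is hypothetical, and no opcode assignment for the EAE instructions is implied.

```python
# Sketch: split a 16-bit word in the JSR-style format into its fields.
def decode_jsr_format(word: int):
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit word")
    operation = word >> 9            # bits 15-9
    register = (word >> 6) & 0o7     # bits 8-6, the 3 "register" bits
    destination = word & 0o77        # bits 5-0, the destination field
    return operation, register, destination


# JSR PC, X(PC) assembles to 004767 (octal) on the PDP-11.
print(decode_jsr_format(0o004767))  # (4, 7, 55)
```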

The value of the effective address of the "destination"
determines the number of positions to be shifted or
rotated. Because the auto-increment and auto-decrement
modes do not apply to these instructions,
one of the 2 mode bits can be used to specify a
single or combined operation (i.e., see PDP-10
LSH, LSHC, etc.). The remaining space can be
used to implement instructions like EXCHANGE,
REPEAT, etc.

Appendix A gives the coding examples for the five
problems. The handling of multi-dimensional arrays
is very cumbersome because the address computations
have to be done on the stack. Introducing a
second set of 16-bit multiply/divide instructions,
implemented as the above EAE instructions, will
solve this problem at the expense of a more complex
instruction set. Subcolumn MPD of Table 1 of Section
6 shows the improvement gained by this.



¹Except for 16-bit multiply/divide.
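
The zero-address evaluation discussed in this section can be sketched with a small interpreter. This is only an illustration of the pure stack idea applied to P2: Python floats stand in for the 32-bit operands, a list stands in for the stack addressed through register 6, and PUSH and POP are shorthand for MOVD X, -(SP) and MOVD (SP)+, X.

```python
# Sketch of "pure stack" evaluation of P2: A <- (B+C)*(D+E).
def run(program, memory):
    stack = []
    for op, *arg in program:
        if op == "PUSH":            # MOVD X, -(SP)
            stack.append(memory[arg[0]])
        elif op == "POP":           # MOVD (SP)+, X
            memory[arg[0]] = stack.pop()
        elif op in ("FADD", "FMUL"):
            # Binary operations take both operands from the top of the
            # stack and leave the result on the top of the stack.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "FADD" else left * right)
        else:
            raise ValueError(f"unknown operation {op}")
    return memory


p2 = [("PUSH", "B"), ("PUSH", "C"), ("FADD",),
      ("PUSH", "D"), ("PUSH", "E"), ("FADD",),
      ("FMUL",), ("POP", "A")]
memory = run(p2, {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0, "E": 4.0})
print(memory["A"])  # 21.0
```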
5.2 Flagged Instructions

The coding examples shown in Appendix B are the
same for alternatives 4.2.1 and 4.2.2. 4.2.2 is
preferable only if the additional opcode space is
needed. It is suggested that the EAE multiply/divide
instructions be implemented in the
space "reserved" for them. The EAE rotate/shift
instructions have to be implemented as "flagged"
instructions; the format would be similar to that
discussed in section 5.1, except for the flag.
The double precision integer and floating point
instructions would be implemented as full source-destination
instructions.

5.3 Modes

Before going into detail, proposal 4.3.2 (setting
the modes for a specific number of instructions
(N)) will be examined. This is considered less
attractive because of problems arising in a
string of N¹ instructions to be executed in the
new mode.

5.3.1 Branching, in terms of skipping over a
group of instructions in the specified
string, will cause problems because N is
not updated automatically.

5.3.2 Programming will be very difficult because
when branching into a sequence of
instructions their mode (in which those
operate) will be difficult to determine.

5.3.3 It will be difficult for a compiler to
set up the right "N" because it will
require some kind of "look-ahead".

5.3.4 In case of interrupts/traps, the remainder
of N has to be saved and restored upon
exit of the interrupt/trap service
routine.



¹Where N is an arbitrary positive number.
For the reasons above, proposal 4.3.2 will be
dropped, and not considered further.
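
The branching problem of 5.3.1 can be made concrete with a small simulation. The instruction model is hypothetical: the count N is decremented once per executed instruction, and SKIP stands for a forward branch over the next k instructions.

```python
# Sketch of the problem noted in 5.3.1: the count is decremented per
# executed instruction, so a forward branch inside the counted string
# leaves the processor in the extended mode for too long.
def modes_seen(program, n):
    """Run `program`; return {index: mode} for each executed instruction.

    program: list of ("OP",) or ("SKIP", k) tuples.
    n: number of instructions to execute in the extended mode.
    """
    seen = {}
    remaining = n
    pc = 0
    while pc < len(program):
        seen[pc] = "extended" if remaining > 0 else "standard"
        if remaining > 0:
            remaining -= 1
        op = program[pc]
        pc += 1 + (op[1] if op[0] == "SKIP" else 0)
    return seen


# Instructions 0-2 are meant to run in extended mode (N = 3); 3 is standard.
straight = [("OP",), ("OP",), ("OP",), ("OP",)]
skipping = [("SKIP", 2), ("OP",), ("OP",), ("OP",)]
print(modes_seen(straight, 3)[3])  # standard
print(modes_seen(skipping, 3)[3])  # extended
```

In the second run instruction 3 is decoded in the wrong mode because two counts are still outstanding when the branch lands on it.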

The extended mode (which contains the floating,
double-precision integer instructions, etc.)
is entered by the command Enter Extended Mode
(EEM). The processor stays in this mode until
the instruction Leave Extended Mode (LEM) is
given.

In regard to 4.3.1, subroutine calls and
interrupts/traps cause problems typical for modes
in saving/restoring the mode and entering the
routine (subroutine or interrupt/trap service
routine) in the correct mode. The interrupt/trap
case is the easiest one. The mode can be
preserved in a dedicated bit in the Central
Processor Status Register (PS). Entering the
interrupt/trap service routine in the right mode
can be done similarly by storing the mode of
that routine in the PS word of the interrupt/trap vector.
The correct mode will then be entered
automatically upon interrupt.

Entering a subroutine in the desired mode in a
program compatible way can be done by taking the
lowest bit (bit 0) of the subroutine address as
the mode bit. In the current PDP-11/20, this
bit has to be equal to zero because the subroutine
address is a word address. By defining a "0"
in bit 0 of the subroutine address as the
standard mode, program compatibility is preserved.
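
The use of bit 0 as a mode bit can be sketched as below. The function and constant names are hypothetical; only the rule that a "0" in bit 0 means standard mode comes from the text.

```python
# Sketch: bit 0 of a subroutine address selects the mode.
STANDARD, EXTENDED = 0, 1


def split_subroutine_address(address: int):
    """Return (word_address, mode) for a 16-bit subroutine address."""
    mode = address & 1                   # bit 0: 0 = standard, 1 = extended
    word_address = address & 0o177776    # clear bit 0 to get the even address
    return word_address, mode


print(split_subroutine_address(0o001000))  # (512, 0)
print(split_subroutine_address(0o001001))  # (512, 1)
```

Existing PDP-11/20 programs only ever produce even subroutine addresses, so they always select the standard mode.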

Saving/restoring the mode upon a subroutine call/exit
is much more difficult. The only hardware
solution found thus far is to store the mode on
the stack in a separate word. The new JSR would
then store 2 words on the stack: the register
to be saved and the mode. Programs making use
of the knowledge that only 1 word gets stored
on the stack by a JSR have to be modified.

A program compatible software solution to the
mode problem is to have the called subroutine
take care of the mode handling by restoring the
mode (upon exit) which existed prior to the
call of the subroutine. A possible way of doing
this is by having the existing mode, prior to
the call of a given subroutine, fixed, such
that the subroutine only has to match the mode
upon exit to the existing (fixed) mode at call.

It is suggested that the multiply and divide
instructions (operating on 16-bit registers,
[illegible]) be implemented in [illegible].

Appendix C shows the coding examples. They
suggest that an instruction to enter the
extended mode for a single instruction is very
useful. The column EEM1 (Enter Extended Mode
for 1 instruction) of Table 1, Section 6,
shows this.
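
The saving from EEM1 can be sketched by rewriting a mode listing: every EEM / one instruction / LEM bracket becomes EEM1 plus that instruction. The sketch uses the mnemonic sequences of the P4 and P1 mode listings of Appendix C; the function name is hypothetical.

```python
# Sketch: replace single-instruction EEM ... LEM brackets by EEM1 and
# compare instruction counts. Mnemonics only, operands omitted.
def use_eem1(listing):
    out = []
    i = 0
    while i < len(listing):
        single = (listing[i] == "EEM"
                  and i + 2 < len(listing)
                  and listing[i + 2] == "LEM"
                  and listing[i + 1] not in ("EEM", "LEM"))
        if single:
            out += ["EEM1", listing[i + 1]]
            i += 3
        else:
            out.append(listing[i])
            i += 1
    return out


p4_mode = ["MOV", "ADD", "EEM", "MOVD", "LEM",
           "MOV", "MUL", "EEM", "FMUL", "LEM"]
p1_mode = ["EEM", "MOVD", "FMUL", "LEM"]
print(len(p4_mode), len(use_eem1(p4_mode)))  # 10 8
print(len(p1_mode), len(use_eem1(p1_mode)))  # 4 4
```

The gain appears only where extended instructions are interleaved with standard ones, as in the subscript computations of P4 and P5.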

5.4 Use Multiply/Divide Space

One of the two opcode spaces has to be
used to implement the EAE instructions as
described in section 5.1. The remaining
instructions have to be implemented with the
stack as an implied operand as discussed in
section 4.4. Coding examples are given in
Appendix D. They show, like the "pure stack"
case, that handling multi-dimensional arrays
is cumbersome. The improvements made by
adding a set of 16-bit multiply/divide
instructions, as suggested in section 5.1,
are shown in subcolumn MPD of Table 1,
Section 6.
6.0 COMPARISON OF PROPOSED SOLUTIONS

Table 1 shows the results of the five problems for the
seven¹ proposed solutions. Four quantifiers are used
for each problem to measure the quality of the solutions.

6.1 The Number of Instructions

It is quite well known that the probability of
making a programming error increases more than
linearly with the number of instructions (apart
from their complexity); thus a "good" solution
should have a low number of instructions.

6.2 The Number of Words²

This is the number of words needed to code the
algorithms given in the appendixes. This is
an important criterion, especially on a small
machine. For a 32-bit machine the numbers have
to be divided by 2.
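
The first two quantifiers can be sketched for the pure stack coding of P1 in Appendix A. The counting rule is an assumption that is not stated in the memo: one word per instruction, plus one word for every operand that names a memory location or a constant, with register and stack operands costing nothing extra. Under that assumption the sketch gives 4 instructions and 7 words for this listing.

```python
# Sketch of the instruction and word counts for one listing.
# FREE_OPERANDS and the counting rule are assumptions of this sketch.
FREE_OPERANDS = {"-(SP)", "(SP)+", "(SP)", "Ri", "Rj", "Rk", "Rs"}


def words(instruction) -> int:
    """16-bit words taken by one instruction under the assumed rule."""
    _mnemonic, *operands = instruction
    return 1 + sum(0 if op in FREE_OPERANDS else 1 for op in operands)


p1_pure_stack = [("MOVD", "C", "-(SP)"),
                 ("MOVD", "B", "-(SP)"),
                 ("FMUL",),
                 ("MOVD", "(SP)+", "A")]
n_instructions = len(p1_pure_stack)
n_words = sum(words(i) for i in p1_pure_stack)
print(n_instructions, n_words, n_words / 2)  # 4 7 3.5
```

The last figure is the word count for a 32-bit machine, obtained by dividing by 2 as described in 6.2.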

6.3 The Number of Memory References

The numbers of memory references both for a 16
and 32-bit machine are included in the tables
because they are important indicators for the
execution times of the algorithms. The
numbers in Table 1 are derived under the
following assumptions:

6.3.1 The stack is supposed to be in core
memory. (Section 6.4 discusses the
results when this assumption is not
made.)

6.3.2 For the two operand extended instructions
the arithmetic unit is supposed to behave
as follows: 1) it reads both operands into
its internal registers; 2) it performs
the required operation (e.g. FMUL, FADD);
and 3) it stores the results back. In
case of different assumptions the numbers
in the table can be adjusted accordingly.

¹Four main solutions, three of which have a subsolution.
²Words are considered to be 16 bits long.
6.4 Number of Memory References With A Hardware Stack

The idea is to implement the top M¹ words of the
stack in flip-flop registers. From Table 1 it
can be seen that the execution speed increases
for almost all problems and solutions. Those
solutions making heavy use of the stack gain
most.



¹For simplicity M is supposed to be such that in none
of the problems the stack "overflows" into core.



TABLE 1 - CODING RESULTS OF PROBLEMS P1 - P5

Memory references are listed as 16-bit machine/32-bit machine.
A "?" marks an entry that is illegible in the source.

                                 PURE STACK            FLAG       MODE                  MULTIPLY/DIVIDE
PROBLEM  QUANTIFIER              main       MPD                   main       EEM1       main       MPD

P1       # of Instructions       4          4          2          4          4          3          3
         # of Words              7          7          8          8          8          6          6
         # of Memory Ref         25/12.5    25/12.5    18/9       18/9       18/9       20/10      20/10
         # of Memory Ref
           With Hardware Stack   13/6.5     13/6.5     18/9       18/9       18/9       12/6       12/6

P2       # of Instructions       8          8          5          7          7          6          6
         # of Words              13         13         17         14         14         11         11
         # of Memory Ref         51/25.5    51/25.5    43/21.5    40/20      40/20      41/20.5    41/20.5
         # of Memory Ref
           With Hardware Stack   23/11.5    23/11.5    35/17.5    32/16      32/16      21/10.5    21/10.5

P3       # of Instructions       4          4          2          4          4          3          3
         # of Words              7          7          8          8          8          6          6
         # of Memory Ref         25/12.5    25/12.5    18/9       18/9       18/9       20/10      20/10
         # of Memory Ref
           With Hardware Stack   13/6.5     13/6.5     18/9       18/9       18/9       12/6       12/6

P4       # of Instructions       10         8          6          10         8          8          7
         # of Words              15         13         14         16         14         13         12
         # of Memory Ref         39/22.5    31/15.5    24/12      26/13      24/12      31/15.5    26/13
         # of Memory Ref
           With Hardware Stack   21/10.5    19/9.5     24/12      26/13      24/12      19/9.5     16/8

P5       # of Instructions       21         15         12         18         15         16         13
         # of Words              28         22         21         24         21         23         20
         # of Memory Ref         74/46      46/23      37/18.5    40/20      37/18.5    55/30.5    40/20
         # of Memory Ref
           With Hardware Stack   ?          28/14      29/14.5    32/16      29/14.5    31/15.5    ?



Table 2 gives a rating summary of Table 1. The
rating is from 1 (lowest) to 7 (highest). When
two solutions have equal rating, they both get
the same number, being the average of the ratings
they would have received had they not been equal.
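
The rating rule can be sketched as an average-rank computation. The example rates the "# of Instructions" row of problem P1 alone, using the counts from Table 1; Table 2 itself rates groups of problems, so its entries differ from this single-problem illustration. The function name is hypothetical.

```python
# Sketch of the rating rule: 7 for the best (lowest) count down to 1 for
# the worst, with tied solutions sharing the average of the ratings they span.
def ratings(counts):
    """Map {solution: count} to {solution: rating}; lower count rates higher."""
    ordered = sorted(counts, key=lambda name: counts[name], reverse=True)
    result = {}
    position = 0
    while position < len(ordered):
        tied = [n for n in ordered if counts[n] == counts[ordered[position]]]
        first = position + 1             # ratings are 1-based
        last = position + len(tied)
        for name in tied:
            result[name] = (first + last) / 2
        position += len(tied)
    return result


p1_instructions = {"PURE STACK": 4, "PURE STACK MPD": 4, "FLAG": 2, "MODE": 4,
                   "EEM1": 4, "MULTIPLY/DIVIDE": 3, "MULTIPLY/DIVIDE MPD": 3}
p1_ratings = ratings(p1_instructions)
print(p1_ratings["FLAG"], p1_ratings["MULTIPLY/DIVIDE"], p1_ratings["MODE"])  # 7.0 5.5 2.5
```

Whatever the ties, the ratings in a row always add up to 1 + 2 + ... + 7 = 28, which gives a quick check on any row of Table 2.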

The problems P1 - P3 are very similar in
nature; therefore a summarized rating is given
in the first part of Table 2, and similarly for
P4 - P5 in the second part of Table 2. The
third part of Table 2 is a summary of the
previous two parts assuming equal weights for
the two previous groups of problems. Part 4
of Table 2 is merely the sum of the first two
quantifiers of the third part.¹ For a small
machine, the number of instructions and the
number of words are the most important criteria
for selecting the best solution. On a bigger
machine, execution speed becomes important.
Part 5 of Table 2 is such an indicator. Its
entries are the sums of the first, second, and
fourth quantifiers of part 3. It is assumed
that on the bigger machine the top of the
stack is implemented in hardware.



¹Again here, for simplicity reasons, equal weights are
assumed.
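
How parts 4 and 5 follow from part 3 can be sketched for three of the columns. The part 3 ratings are copied from Table 2; since the memory reference ratings are the same for the 16-bit and the 32-bit machine, a single value is kept. The function names are hypothetical.

```python
# Sketch: derive parts 4 and 5 of Table 2 from part 3.
part3 = {
    # solution: (instructions, words, memory ref, memory ref with hardware stack)
    "PURE STACK":      (2.5, 5.5, 2.5, 6.5),
    "FLAG":            (14.0, 6.0, 11.5, 4.5),
    "MULTIPLY/DIVIDE": (8.5, 9.5, 5.5, 11.5),
}


def part4(row):
    """Small machine indicator: instructions + words."""
    return row[0] + row[1]


def part5(row):
    """Bigger machine indicator: instructions + words + memory ref with hardware stack."""
    return row[0] + row[1] + row[3]


for name, row in part3.items():
    print(name, part4(row), part5(row))
```

With equal weights the two indicators are plain sums; a different weighting would only change the bodies of the two functions.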





TABLE 2 - RATING SUMMARY OF CODING PROBLEMS

Memory reference ratings are listed as 16-bit machine/32-bit machine.

PART                                 PURE STACK            FLAG       MODE                  MULTIPLY/DIVIDE
PROBLEMS   QUANTIFIER                main       MPD                   main       EEM1       main       MPD

1          # of Instructions         1.5        1.5        7          3.5        3.5        5.5        5.5
P1 - P3    # of Words                4.5        4.5        1          2.5        2.5        6.5        6.5
           # of Memory Ref           1.5/1.5    1.5/1.5    5/5        6.5/6.5    6.5/6.5    3.5/3.5    3.5/3.5
           # of Memory Ref
             With Hardware Stack     4.5/4.5    4.5/4.5    1/1        2.5/2.5    2.5/2.5    6.5/6.5    6.5/6.5

2          # of Instructions         1          4.5        7          2          4.5        3          6
P4 - P5    # of Words                1          5          5          2          5          3          7
           # of Memory Ref           1/1        3/3        6.5/6.5    4.5/4.5    6.5/6.5    2/2        4.5/4.5
           # of Memory Ref
             With Hardware Stack     2/2        6/6        3.5/3.5    1/1        3.5/3.5    5/5        7/7

3          # of Instructions         2.5        6.0        14         5.5        7.5        8.5        11.5
P1 - P5    # of Words                5.5        9.5        6          4.5        7.5        9.5        13.5
           # of Memory Ref           2.5/2.5    4.5/4.5    11.5/11.5  11/11      13/13      5.5/5.5    8/8
           # of Memory Ref
             With Hardware Stack     6.5/6.5    10.5/10.5  4.5/4.5    3.5/3.5    6/6        11.5/11.5  13.5/13.5

4          # of Instructions
             + Number of Words       8          15.5       20         10.0       15.0       18.0       25

5          # of Instructions
             + Number of Words
             + # of Memory Ref
             With Hardware Stack     14.5       26         24.5       13.5       21.0       29.5       38.5
7.0 CONCLUSION

Looking at Table 2, parts 4 and 5, it can be concluded
that the subsolutions (i.e., MPD for "pure stack" and
"multiply/divide", and EEM1 for "mode") are a big
improvement over their "main" solutions. This is
because of the improved handling of multi-dimensional
arrays. The price paid for this, however, is a more
complex instruction set (i.e., adding a duplicate
set of 16-bit multiply/divide instructions to operate
on registers, or entering the extended mode for a single
instruction).

The main solutions "pure stack" and "mode" have the
lowest rating and can therefore be excluded from
further consideration.

In order to make a definite commitment to any of the
remaining five solutions, more research should be
done in determining the weights of the problems and
weights of the quantifiers.

From the results thus far, however, the following
can be said:

7.1 The "mode" subsolution has to look much better¹
in order to be a candidate because of the
mode problems in subroutines. The suggested
hardware solution is such that the price of
storing the mode on the stack has to be paid
ALWAYS, also in programs which do not make
use of the modes (i.e., all current PDP-11
software). For this reason the suggested
software solution is a better candidate
because there the price is only paid when
modes are used.

7.2 The "flag" solution is advisable only when it
is expected that the use of the "flagged"
instructions (i.e., those of class 2.1.2 and
2.1.3 of section 2) is low.

7.3 The most promising solution thus far is the
"multiply/divide" subsolution. It consistently
scored highest or second highest.



¹When the proper weights are found.
APPENDIX A

PURE STACK CODING EXAMPLES

P1: A <- B*C

    MOVD    C, -(SP)        /move C to the stack
    MOVD    B, -(SP)        /move B to the stack
    FMUL                    /floating multiply B*C
    MOVD    (SP)+, A        /store result in A

P2: A <- (B+C)*(D+E)

    MOVD    B, -(SP)
    MOVD    C, -(SP)
    FADD                    /floating add B+C
    MOVD    D, -(SP)
    MOVD    E, -(SP)
    FADD                    /floating add D+E
    FMUL                    /floating multiply (D+E)*(B+C)
    MOVD    (SP)+, A

P3: A(i) <- B(i)*C(i)

    MOVD    C(Ri), -(SP)    /assume index i is in register Ri
                            /move C(i) to the stack
    MOVD    B(Ri), -(SP)
    FMUL
    MOVD    (SP)+, A(Ri)

P4: A(i) <- B(i+3)*C(i*5)

    MOV     Ri, Rs          /Rs is a scratch register
    ADD     #3, Rs          /index i+3 formed
    MOVD    B(Rs), -(SP)
    MOV     Ri, -(SP)
    MOV     #5, -(SP)
    IMUL                    /compute i*5 and leave 1 word result
                            on top of stack
    MOV     (SP)+, Rs
    MOVD    C(Rs), -(SP)
    FMUL
    MOVD    (SP)+, A(Ri)    /store result
APPENDIX A (CONT.)

P5: A(i,j) <- A(i,j)+B(i,k)*C(k,j)

    MOV     Ri, -(SP)
    MOV     #bu1, -(SP)
    IMUL
    MOV     (SP)+, Rs
    ADD     Rk, Rs          /Rs contains index for array B
    MOVD    B(Rs), -(SP)    /put B(i,k) on stack
    MOV     Rk, -(SP)
    MOV     #cu1, -(SP)
    IMUL
    MOV     (SP)+, Rs
    ADD     Rj, Rs          /Rs contains index for array C
    MOVD    C(Rs), -(SP)
    FMUL
    MOV     Ri, -(SP)
    MOV     #au1, -(SP)
    IMUL
    MOV     (SP)+, Rs
    ADD     Rj, Rs          /Rs contains index for array A
    MOVD    A(Rs), -(SP)
    FADD
    MOVD    (SP)+, A(Rs)    /store result
APPENDIX B

FLAGGED INSTRUCTIONS CODING EXAMPLES

P1: A <- B*C

    MOVD    B, A            /move B to A
    FMUL    C, A

P2: A <- (B+C)*(D+E)

    MOVD    B, A
    FADD    C, A            /A = B+C now
    MOVD    D, -(SP)
    FADD    E, (SP)         /top of the stack is D+E
    FMUL    (SP)+, A

P3: A(i) <- B(i)*C(i)

    MOVD    B(Ri), A(Ri)    /move B(i) to A(i)
    FMUL    C(Ri), A(Ri)

P4: A(i) <- B(i+3)*C(i*5)   /Rs is a scratch register

    MOV     Ri, Rs
    ADD     #3, Rs          /index for B(i+3) computed
    MOVD    B(Rs), A(Ri)
    MOV     Ri, Rs
    MUL     #5, Rs          /index for C(i*5) computed
    FMUL    C(Rs), A(Ri)

P5: A(i,j) <- A(i,j) + B(i,k) * C(k,j)

    MOV     Ri, Rs
    MUL     #bu1, Rs
    ADD     Rk, Rs          /index for B(i,k) computed
    MOVD    B(Rs), -(SP)
    MOV     Rk, Rs
    MUL     #cu1, Rs
    ADD     Rj, Rs          /index for C(k,j) computed
    FMUL    C(Rs), (SP)
    MOV     Ri, Rs
    MUL     #au1, Rs
    ADD     Rj, Rs          /index for A(i,j) computed
    FADD    (SP)+, A(Rs)
APPENDIX C

MODE CODING EXAMPLES

P1: A <- B*C

    EEM                     /enter extended mode
    MOVD    B, A
    FMUL    C, A
    LEM                     /leave extended mode

P2: A <- (B+C)*(D+E)

    EEM                     /enter extended mode
    MOVD    B, A
    FADD    C, A
    MOVD    D, -(SP)
    FADD    E, (SP)
    FMUL    (SP)+, A
    LEM                     /leave extended mode

P3: A(i) <- B(i)*C(i)

    EEM
    MOVD    B(Ri), A(Ri)
    FMUL    C(Ri), A(Ri)
    LEM

P4: A(i) <- B(i+3)*C(i*5)

    MOV     Ri, Rs
    ADD     #3, Rs
    EEM
    MOVD    B(Rs), A(Ri)
    LEM
    MOV     Ri, Rs
    MUL     #5, Rs
    EEM
    FMUL    C(Rs), A(Ri)
    LEM

P5: A(i,j) <- A(i,j) + B(i,k) * C(k,j)

    MOV     Ri, Rs
    MUL     #bu1, Rs
    ADD     Rk, Rs          /index for B(i,k) computed
    EEM
    MOVD    B(Rs), -(SP)
    LEM
    MOV     Rk, Rs
    MUL     #cu1, Rs
    ADD     Rj, Rs          /index for C(k,j) computed
    EEM
    FMUL    C(Rs), (SP)

P5: Cont.
APPENDIX C (CONT.)

MODE CODING EXAMPLES

P5: (cont.)

    LEM
    MOV     Ri, Rs
    MUL     #au1, Rs
    ADD     Rj, Rs          /index for A(i,j) computed
    EEM
    FADD    (SP)+, A(Rs)
    LEM



APPENDIX D

MULTIPLY/DIVIDE SPACE CODING EXAMPLES

P1: A <- B*C

    MOVD    B, -(SP)        /move B to the stack
    FMUL    C, (SP)         /multiply C with top of the stack
    MOVD    (SP)+, A        /move result to A

P2: A <- (B+C)*(D+E)

    MOVD    B, -(SP)
    FADD    C, (SP)
    MOVD    D, -(SP)
    FADD    E, (SP)
    FMUL    (SP)+, (SP)
    MOVD    (SP)+, A

P3: A(i) <- B(i)*C(i)

    MOVD    B(Ri), -(SP)
    FMUL    C(Ri), (SP)
    MOVD    (SP)+, A(Ri)

P4: A(i) <- B(i+3)*C(i*5)

    MOV     Ri, Rs
    ADD     #3, Rs          /index i+3 in Rs
    MOVD    B(Rs), -(SP)
    MOV     Ri, -(SP)
    IMUL    #5, (SP)
    MOV     (SP)+, Rs       /index i*5 in Rs
    FMUL    C(Rs), (SP)
    MOVD    (SP)+, A(Ri)

P5: A(i,j) <- A(i,j) + B(i,k) * C(k,j)

    MOV     Ri, -(SP)
    IMUL    #bu1, (SP)
    MOV     (SP)+, Rs
    ADD     Rk, Rs          /index for B(i,k) computed
    MOVD    B(Rs), -(SP)
    MOV     Rk, -(SP)
    IMUL    #cu1, (SP)
    MOV     (SP)+, Rs
    ADD     Rj, Rs          /index for C(k,j) computed
    FMUL    C(Rs), (SP)
    MOV     Ri, -(SP)
    IMUL    #au1, (SP)
    MOV     (SP)+, Rs
    ADD     Rj, Rs          /index for A(i,j) computed
    FADD    A(Rs), (SP)
    MOVD    (SP)+, A(Rs)



